Overview

Dataset statistics

Number of variables: 13
Number of observations: 2790
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 283.5 KiB
Average record size in memory: 104.0 B
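
The memory figures above can be sanity-checked with simple arithmetic. The sketch below assumes, without confirmation from the report, that each of the 13 numeric columns stores 8 bytes per value and that the DataFrame index contributes a fixed 128 bytes; both constants are illustrative assumptions, and `estimate_memory` is a hypothetical helper.

```python
# Figures copied from the "Dataset statistics" block.
N_VARIABLES = 13
N_OBSERVATIONS = 2790

# Assumptions (not stated in the report): 8 bytes per stored value and a
# fixed 128-byte index overhead.
BYTES_PER_VALUE = 8
INDEX_OVERHEAD_BYTES = 128


def estimate_memory(n_rows, n_cols,
                    bytes_per_value=BYTES_PER_VALUE,
                    index_bytes=INDEX_OVERHEAD_BYTES):
    """Return (total size in KiB, average record size in bytes)."""
    total_bytes = n_rows * n_cols * bytes_per_value + index_bytes
    return total_bytes / 1024, total_bytes / n_rows


total_kib, record_bytes = estimate_memory(N_OBSERVATIONS, N_VARIABLES)
print(f"{total_kib:.1f} KiB, {record_bytes:.1f} B per record")
# prints "283.5 KiB, 104.0 B per record"
```

Under these assumptions the estimate agrees with the reported 283.5 KiB and 104.0 B.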

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with qt_items and 3 other fields [High correlation]
qt_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qt_invoice is highly correlated with gross_revenue and 2 other fields [High correlation]
qt_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_unique_basket_size is highly correlated with qt_products and 1 other field [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
gross_revenue is highly correlated with qt_items and 1 other field [High correlation]
qt_items is highly correlated with gross_revenue and 1 other field [High correlation]
qt_invoice is highly correlated with gross_revenue and 2 other fields [High correlation]
qt_products is highly correlated with qt_invoice and 1 other field [High correlation]
avg_ticket is highly correlated with qt_returns and 1 other field [High correlation]
qt_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qt_products [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
gross_revenue is highly correlated with qt_items and 1 other field [High correlation]
qt_items is highly correlated with gross_revenue and 2 other fields [High correlation]
qt_invoice is highly correlated with gross_revenue and 1 other field [High correlation]
qt_products is highly correlated with avg_unique_basket_size [High correlation]
avg_unique_basket_size is highly correlated with qt_products [High correlation]
avg_basket_size is highly correlated with qt_items [High correlation]
gross_revenue is highly correlated with qt_items and 5 other fields [High correlation]
qt_items is highly correlated with gross_revenue and 5 other fields [High correlation]
qt_invoice is highly correlated with gross_revenue and 2 other fields [High correlation]
qt_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
qt_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with qt_products [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 51.95952231) [Skewed]
frequency is highly skewed (γ1 = 47.28301617) [Skewed]
qt_returns is highly skewed (γ1 = 49.28702505) [Skewed]
avg_basket_size is highly skewed (γ1 = 45.04601647) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.2%) zeros [Zeros]
qt_returns has 1456 (52.2%) zeros [Zeros]
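
The skewness alerts report a γ1 value per variable. As a minimal sketch of how such a value can be computed, the function below implements the bias-adjusted Fisher-Pearson estimator, which is assumed here to be the estimator behind the reported figures. The threshold of 20 is also an assumption: it is consistent with the alerts above (gross_revenue at 16.1 is not flagged, avg_basket_size at 45.0 is), but the report does not state it. `sample_skewness` and `is_highly_skewed` are hypothetical names.

```python
import math

# Assumed alert threshold; not stated anywhere in the report itself.
SKEWNESS_THRESHOLD = 20


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness coefficient."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def is_highly_skewed(values, threshold=SKEWNESS_THRESHOLD):
    """Flag a variable whose absolute skewness exceeds the threshold."""
    return abs(sample_skewness(values)) > threshold


# One extreme value among many small ones produces a very large skewness,
# the same pattern as avg_ticket (median 18.2, maximum 56157.5).
toy = [0.0] * 999 + [1000.0]
print(is_highly_skewed(toy))
```

A single dominant outlier is enough to trigger the alert, which suggests inspecting the largest values of each flagged variable before transforming it.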

Reproduction

Analysis started: 2022-03-26 14:59:45.818052
Analysis finished: 2022-03-26 15:00:38.115295
Duration: 52.3 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
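
The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamp strings copied from the block above:

```python
from datetime import datetime

# Timestamps as printed in the "Reproduction" block.
started = datetime.fromisoformat("2022-03-26 14:59:45.818052")
finished = datetime.fromisoformat("2022-03-26 15:00:38.115295")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")  # prints "52.3 seconds"
```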

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2790
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2309.910753
Minimum: 0
Maximum: 5887
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 182.45
Q1: 920.25
Median: 2100
Q3: 3498.75
95-th percentile: 5127.55
Maximum: 5887
Range: 5887
Interquartile range (IQR): 2578.5

Descriptive statistics

Standard deviation: 1577.346218
Coefficient of variation (CV): 0.6828602429
Kurtosis: -0.9354811747
Mean: 2309.910753
Median Absolute Deviation (MAD): 1267
Skewness: 0.3979437407
Sum: 6444651
Variance: 2488021.09
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     1       < 0.1%
2976                  1       < 0.1%
2963                  1       < 0.1%
2964                  1       < 0.1%
2965                  1       < 0.1%
2966                  1       < 0.1%
2968                  1       < 0.1%
2969                  1       < 0.1%
2973                  1       < 0.1%
2974                  1       < 0.1%
Other values (2780)   2780    99.6%

Value                 Count   Frequency (%)
0                     1       < 0.1%
1                     1       < 0.1%
2                     1       < 0.1%
3                     1       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
7                     1       < 0.1%
8                     1       < 0.1%
9                     1       < 0.1%

Value                 Count   Frequency (%)
5887                  1       < 0.1%
5877                  1       < 0.1%
5871                  1       < 0.1%
5846                  1       < 0.1%
5840                  1       < 0.1%
5829                  1       < 0.1%
5828                  1       < 0.1%
5811                  1       < 0.1%
5810                  1       < 0.1%
5808                  1       < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2790
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15281.86846
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12620.45
Q1: 13812.25
Median: 15240.5
Q3: 16778.75
95-th percentile: 17950.55
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2966.5

Descriptive statistics

Standard deviation: 1717.476402
Coefficient of variation (CV): 0.1123865453
Kurtosis: -1.207312643
Mean: 15281.86846
Median Absolute Deviation (MAD): 1485
Skewness: 0.01466970702
Sum: 42636413
Variance: 2949725.19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
17850                 1       < 0.1%
16806                 1       < 0.1%
16249                 1       < 0.1%
14198                 1       < 0.1%
13989                 1       < 0.1%
17930                 1       < 0.1%
14482                 1       < 0.1%
14163                 1       < 0.1%
13811                 1       < 0.1%
12457                 1       < 0.1%
Other values (2780)   2780    99.6%

Value                 Count   Frequency (%)
12347                 1       < 0.1%
12348                 1       < 0.1%
12352                 1       < 0.1%
12356                 1       < 0.1%
12358                 1       < 0.1%
12359                 1       < 0.1%
12360                 1       < 0.1%
12362                 1       < 0.1%
12363                 1       < 0.1%
12364                 1       < 0.1%

Value                 Count   Frequency (%)
18287                 1       < 0.1%
18283                 1       < 0.1%
18282                 1       < 0.1%
18273                 1       < 0.1%
18272                 1       < 0.1%
18270                 1       < 0.1%
18265                 1       < 0.1%
18263                 1       < 0.1%
18261                 1       < 0.1%
18260                 1       < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2773
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2941.458523
Minimum: 6.9
Maximum: 280206.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 6.9
5-th percentile: 263.363
Q1: 628.9125
Median: 1192.93
Q3: 2474.305
95-th percentile: 7813.0015
Maximum: 280206.02
Range: 280199.12
Interquartile range (IQR): 1845.3925

Descriptive statistics

Standard deviation: 10981.33843
Coefficient of variation (CV): 3.733297052
Kurtosis: 327.0532004
Mean: 2941.458523
Median Absolute Deviation (MAD): 703.095
Skewness: 16.11726891
Sum: 8206669.28
Variance: 120589793.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
889.93                2       0.1%
1353.74               2       0.1%
1418.03               2       0.1%
745.06                2       0.1%
2092.32               2       0.1%
178.96                2       0.1%
618.09                2       0.1%
734.94                2       0.1%
379.65                2       0.1%
1314.45               2       0.1%
Other values (2763)   2770    99.3%

Value                 Count   Frequency (%)
6.9                   1       < 0.1%
36.56                 1       < 0.1%
52                    1       < 0.1%
52.2                  1       < 0.1%
62.43                 1       < 0.1%
68.84                 1       < 0.1%
70.02                 1       < 0.1%
77.4                  1       < 0.1%
84.65                 1       < 0.1%
90.3                  1       < 0.1%

Value                 Count   Frequency (%)
280206.02             1       < 0.1%
259657.3              1       < 0.1%
194550.79             1       < 0.1%
168472.5              1       < 0.1%
143825.06             1       < 0.1%
124914.53             1       < 0.1%
117379.63             1       < 0.1%
91062.38              1       < 0.1%
81024.84              1       < 0.1%
66653.56              1       < 0.1%
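
Several of the gross_revenue statistics are simple functions of others (range, IQR, coefficient of variation, variance, sum). The sketch below recomputes them from the reported mean, standard deviation, extremes, and quartiles, and compares the results with the reported values. The dictionary names are illustrative; the numbers are copied from the gross_revenue section.

```python
import math

# Inputs copied from the gross_revenue section of the report.
stats = {
    "n": 2790,
    "mean": 2941.458523,
    "std": 10981.33843,
    "min": 6.9,
    "max": 280206.02,
    "q1": 628.9125,
    "q3": 2474.305,
}

# Quantities that follow from the inputs by definition.
derived = {
    "range": stats["max"] - stats["min"],
    "iqr": stats["q3"] - stats["q1"],
    "cv": stats["std"] / stats["mean"],
    "variance": stats["std"] ** 2,
    "sum": stats["mean"] * stats["n"],
}

# The same quantities as printed in the report.
reported = {
    "range": 280199.12,
    "iqr": 1845.3925,
    "cv": 3.733297052,
    "variance": 120589793.8,
    "sum": 8206669.28,
}

for name, value in derived.items():
    matches = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.6g} reported={reported[name]:.6g} match={matches}")
```

A coefficient of variation above 3 means the standard deviation is more than three times the mean, which fits the long right tail visible in the largest values.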

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 251
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.7921147
Minimum: 0
Maximum: 372
Zeros: 34
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
Median: 29
Q3: 73
95-th percentile: 211
Maximum: 372
Range: 372
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 68.2991265
Coefficient of variation (CV): 1.202616364
Kurtosis: 3.350490882
Mean: 56.7921147
Median Absolute Deviation (MAD): 23.5
Skewness: 1.881292285
Sum: 158450
Variance: 4664.77068
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     99      3.5%
4                     87      3.1%
2                     85      3.0%
3                     85      3.0%
8                     76      2.7%
10                    67      2.4%
9                     67      2.4%
7                     65      2.3%
17                    62      2.2%
22                    55      2.0%
Other values (241)    2042    73.2%

Value                 Count   Frequency (%)
0                     34      1.2%
1                     99      3.5%
2                     85      3.0%
3                     85      3.0%
4                     87      3.1%
5                     43      1.5%
7                     65      2.3%
8                     76      2.7%
9                     67      2.4%
10                    67      2.4%

Value                 Count   Frequency (%)
372                   1       < 0.1%
366                   1       < 0.1%
360                   1       < 0.1%
358                   3       0.1%
337                   1       < 0.1%
336                   2       0.1%
334                   1       < 0.1%
333                   2       0.1%
330                   1       < 0.1%
326                   1       < 0.1%

qt_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1642
Distinct (%): 58.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1698.045878
Minimum: 2
Maximum: 196915
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 2
5-th percentile: 117.45
Q1: 330
Median: 700.5
Q3: 1478.75
95-th percentile: 4635.75
Maximum: 196915
Range: 196913
Interquartile range (IQR): 1148.75

Descriptive statistics

Standard deviation: 6068.875091
Coefficient of variation (CV): 3.574034818
Kurtosis: 438.7014545
Mean: 1698.045878
Median Absolute Deviation (MAD): 450
Skewness: 17.32924852
Sum: 4737548
Variance: 36831244.88
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
300                   10      0.4%
310                   10      0.4%
150                   8       0.3%
394                   8       0.3%
88                    8       0.3%
219                   7       0.3%
246                   7       0.3%
306                   7       0.3%
272                   7       0.3%
493                   7       0.3%
Other values (1632)   2711    97.2%

Value                 Count   Frequency (%)
2                     1       < 0.1%
3                     1       < 0.1%
16                    1       < 0.1%
17                    1       < 0.1%
19                    1       < 0.1%
20                    1       < 0.1%
25                    1       < 0.1%
27                    2       0.1%
30                    1       < 0.1%
32                    1       < 0.1%

Value                 Count   Frequency (%)
196915                1       < 0.1%
80997                 1       < 0.1%
80265                 1       < 0.1%
77374                 1       < 0.1%
69993                 1       < 0.1%
64549                 1       < 0.1%
64124                 1       < 0.1%
63312                 1       < 0.1%
58343                 1       < 0.1%
57885                 1       < 0.1%

qt_invoice
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 58
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.065232975
Minimum: 2
Maximum: 209
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 2
Median: 4
Q3: 7
95-th percentile: 17
Maximum: 209
Range: 207
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 9.116357263
Coefficient of variation (CV): 1.503051457
Kurtosis: 187.7369859
Mean: 6.065232975
Median Absolute Deviation (MAD): 2
Skewness: 10.72802592
Sum: 16922
Variance: 83.10796974
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
2                     785     28.1%
3                     505     18.1%
4                     386     13.8%
5                     242     8.7%
6                     172     6.2%
7                     143     5.1%
8                     98      3.5%
9                     68      2.4%
10                    54      1.9%
11                    52      1.9%
Other values (48)     285     10.2%

Value                 Count   Frequency (%)
2                     785     28.1%
3                     505     18.1%
4                     386     13.8%
5                     242     8.7%
6                     172     6.2%
7                     143     5.1%
8                     98      3.5%
9                     68      2.4%
10                    54      1.9%
11                    52      1.9%

Value                 Count   Frequency (%)
209                   1       < 0.1%
201                   1       < 0.1%
124                   1       < 0.1%
97                    1       < 0.1%
93                    1       < 0.1%
91                    1       < 0.1%
86                    1       < 0.1%
73                    1       < 0.1%
63                    1       < 0.1%
62                    1       < 0.1%

qt_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 341
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.07168459
Minimum: 1
Maximum: 1786
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 29
Median: 56
Q3: 105
95-th percentile: 239.55
Maximum: 1786
Range: 1785
Interquartile range (IQR): 76

Descriptive statistics

Standard deviation: 98.60104224
Coefficient of variation (CV): 1.186939241
Kurtosis: 80.64019576
Mean: 83.07168459
Median Absolute Deviation (MAD): 33
Skewness: 6.347933357
Sum: 231770
Variance: 9722.16553
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
37                    39      1.4%
24                    37      1.3%
26                    36      1.3%
25                    35      1.3%
33                    35      1.3%
28                    34      1.2%
18                    32      1.1%
30                    32      1.1%
15                    30      1.1%
23                    30      1.1%
Other values (331)    2450    87.8%

Value                 Count   Frequency (%)
1                     21      0.8%
2                     13      0.5%
3                     18      0.6%
4                     18      0.6%
5                     23      0.8%
6                     19      0.7%
7                     22      0.8%
8                     24      0.9%
9                     23      0.8%
10                    20      0.7%

Value                 Count   Frequency (%)
1786                  1       < 0.1%
1766                  1       < 0.1%
1322                  1       < 0.1%
1118                  1       < 0.1%
884                   1       < 0.1%
817                   1       < 0.1%
717                   1       < 0.1%
714                   1       < 0.1%
699                   1       < 0.1%
636                   1       < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2788
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.9729736
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.851708778
Q1: 12.51992731
Median: 18.1679386
Q3: 25.34082943
95-th percentile: 89.01555123
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 12.82090212

Descriptive statistics

Standard deviation: 1068.595594
Coefficient of variation (CV): 20.17246762
Kurtosis: 2727.49197
Mean: 52.9729736
Median Absolute Deviation (MAD): 6.418897485
Skewness: 51.95952231
Sum: 147794.5963
Variance: 1141896.544
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
14.47833333           2       0.1%
4.162                 2       0.1%
18.15222222           1       < 0.1%
38.1166129            1       < 0.1%
26.08797101           1       < 0.1%
17.98461538           1       < 0.1%
30.88                 1       < 0.1%
44.62769231           1       < 0.1%
14.36215278           1       < 0.1%
47.03673077           1       < 0.1%
Other values (2778)   2778    99.6%

Value                 Count   Frequency (%)
2.150588235           1       < 0.1%
2.4325                1       < 0.1%
2.462371134           1       < 0.1%
2.504876033           1       < 0.1%
2.50837156            1       < 0.1%
2.65                  1       < 0.1%
2.656931818           1       < 0.1%
2.707598253           1       < 0.1%
2.760621572           1       < 0.1%
2.771005291           1       < 0.1%

Value                 Count   Frequency (%)
56157.5               1       < 0.1%
4453.43               1       < 0.1%
2027.86               1       < 0.1%
1687.2                1       < 0.1%
952.9875              1       < 0.1%
872.13                1       < 0.1%
835.864               1       < 0.1%
643.8585714           1       < 0.1%
640                   1       < 0.1%
624.4                 1       < 0.1%

avg_recency_days
Real number (ℝ≥0)

Distinct: 45
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.020807817
Minimum: 1
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1.081833333
Maximum: 3
Range: 2
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1172111288
Coefficient of variation (CV): 0.1148219349
Kurtosis: 93.59539097
Mean: 1.020807817
Median Absolute Deviation (MAD): 0
Skewness: 8.538125149
Sum: 2848.05381
Variance: 0.01373844871
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
Value                 Count   Frequency (%)
1                     2616    93.8%
2                     19      0.7%
1.5                   18      0.6%
1.2                   15      0.5%
1.25                  15      0.5%
1.333333333           14      0.5%
1.166666667           14      0.5%
1.142857143           10      0.4%
1.666666667           5       0.2%
1.071428571           4       0.1%
Other values (35)     60      2.2%

Value                 Count   Frequency (%)
1                     2616    93.8%
1.021276596           1       < 0.1%
1.027027027           1       < 0.1%
1.028571429           1       < 0.1%
1.030534351           1       < 0.1%
1.030769231           1       < 0.1%
1.035714286           2       0.1%
1.038461538           1       < 0.1%
1.042857143           1       < 0.1%
1.043478261           1       < 0.1%

Value                 Count   Frequency (%)
3                     2       0.1%
2                     19      0.7%
1.823529412           1       < 0.1%
1.666666667           5       0.2%
1.5                   18      0.6%
1.48                  1       < 0.1%
1.416666667           1       < 0.1%
1.4                   3       0.1%
1.375                 1       < 0.1%
1.333333333           14      0.5%

frequency
Real number (ℝ≥0)

SKEWED

Distinct: 1235
Distinct (%): 44.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06310623622
Minimum: 0.005464480874
Maximum: 34
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 0.005464480874
5-th percentile: 0.008771929825
Q1: 0.01587301587
Median: 0.02469135802
Q3: 0.04255319149
95-th percentile: 0.1212022367
Maximum: 34
Range: 33.99453552
Interquartile range (IQR): 0.02668017562

Descriptive statistics

Standard deviation: 0.66878999
Coefficient of variation (CV): 10.5978431
Kurtosis: 2382.264166
Mean: 0.06310623622
Median Absolute Deviation (MAD): 0.01099272789
Skewness: 47.28301617
Sum: 176.0663991
Variance: 0.4472800507
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0.07142857143         15      0.5%
0.04761904762         14      0.5%
0.02857142857         14      0.5%
0.0303030303          14      0.5%
0.01587301587         14      0.5%
0.02380952381         13      0.5%
0.06451612903         13      0.5%
0.025                 12      0.4%
0.1176470588          12      0.4%
0.03846153846         12      0.4%
Other values (1225)   2657    95.2%

Value                 Count   Frequency (%)
0.005464480874        1       < 0.1%
0.005479452055        1       < 0.1%
0.005494505495        1       < 0.1%
0.005509641873        1       < 0.1%
0.005602240896        2       0.1%
0.005617977528        1       < 0.1%
0.005633802817        2       0.1%
0.005681818182        1       < 0.1%
0.005698005698        2       0.1%
0.005714285714        3       0.1%

Value                 Count   Frequency (%)
34                    1       < 0.1%
6                     1       < 0.1%
4                     1       < 0.1%
2                     7       0.3%
1.5                   1       < 0.1%
1.333333333           2       0.1%
1                     5       0.2%
0.6666666667          3       0.1%
0.5603217158          1       < 0.1%
0.5403225806          1       < 0.1%

qt_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 208
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.68960573
Minimum: 0
Maximum: 80995
Zeros: 1456
Zeros (%): 52.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 9
95-th percentile: 102.1
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1570.684029
Coefficient of variation (CV): 22.86640041
Kurtosis: 2530.258092
Mean: 68.68960573
Median Absolute Deviation (MAD): 0
Skewness: 49.28702505
Sum: 191644
Variance: 2467048.319
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     1456    52.2%
1                     145     5.2%
2                     121     4.3%
3                     84      3.0%
4                     73      2.6%
5                     58      2.1%
6                     58      2.1%
8                     45      1.6%
12                    42      1.5%
7                     41      1.5%
Other values (198)    667     23.9%

Value                 Count   Frequency (%)
0                     1456    52.2%
1                     145     5.2%
2                     121     4.3%
3                     84      3.0%
4                     73      2.6%
5                     58      2.1%
6                     58      2.1%
7                     41      1.5%
8                     45      1.6%
9                     35      1.3%

Value                 Count   Frequency (%)
80995                 1       < 0.1%
9361                  1       < 0.1%
9014                  1       < 0.1%
8060                  1       < 0.1%
4627                  1       < 0.1%
3768                  1       < 0.1%
3335                  1       < 0.1%
2975                  1       < 0.1%
2160                  1       < 0.1%
2022                  1       < 0.1%
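
For qt_returns, the zero share (52.2%) explains why the median and the median absolute deviation are both reported as 0: once more than half of the observations equal zero, the middle of the sorted data is zero, and so is the middle of the absolute deviations from it. The sketch below illustrates this on a small made-up sample; `zero_share` and `median_abs_deviation` are hypothetical helpers and the sample values are not taken from the dataset.

```python
from statistics import median


def zero_share(values):
    """Fraction of observations that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def median_abs_deviation(values):
    """Median of absolute deviations from the median (unscaled MAD)."""
    med = median(values)
    return median(abs(v - med) for v in values)


# Made-up sample: 6 of 10 customers have no returns, one has many.
sample = [0, 0, 0, 0, 0, 0, 1, 2, 5, 40]
print(zero_share(sample), median(sample), median_abs_deviation(sample))

# Zero share as reported for qt_returns: 1456 zeros out of 2790 rows.
reported_zero_pct = 100 * 1456 / 2790
print(f"{reported_zero_pct:.1f}%")  # prints "52.2%"
```

In such cases the mean and standard deviation are driven almost entirely by the non-zero tail, so the median-based statistics say little about the customers who do return items.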

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1002
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.9785979
Minimum: 0.5
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 3.333333333
Q1: 10
Median: 17.21111111
Q3: 27.9375
95-th percentile: 56.27333333
Maximum: 299.7058824
Range: 299.2058824
Interquartile range (IQR): 17.9375

Descriptive statistics

Standard deviation: 18.76755425
Coefficient of variation (CV): 0.8539013421
Kurtosis: 24.64773574
Mean: 21.9785979
Median Absolute Deviation (MAD): 8.211111111
Skewness: 3.189408317
Sum: 61320.28815
Variance: 352.2210925
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
13                    46      1.6%
14                    30      1.1%
11                    29      1.0%
9                     28      1.0%
7.5                   25      0.9%
1                     25      0.9%
10.5                  25      0.9%
9.5                   25      0.9%
17.5                  24      0.9%
15.5                  24      0.9%
Other values (992)    2509    89.9%

Value                 Count   Frequency (%)
0.5                   2       0.1%
0.8571428571          1       < 0.1%
1                     25      0.9%
1.2                   1       < 0.1%
1.25                  1       < 0.1%
1.333333333           2       0.1%
1.5                   8       0.3%
1.533333333           1       < 0.1%
1.571428571           1       < 0.1%
1.666666667           4       0.1%

Value                 Count   Frequency (%)
299.7058824           1       < 0.1%
203.5                 1       < 0.1%
145                   1       < 0.1%
136.125               1       < 0.1%
135.5                 1       < 0.1%
122                   1       < 0.1%
118                   1       < 0.1%
114                   1       < 0.1%
110.3333333           1       < 0.1%
110                   1       < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1937
Distinct (%): 69.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 244.8489954
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44.575
Q1: 103.225
Median: 171.8571429
Q3: 277.28125
95-th percentile: 590.275
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 174.05625

Descriptive statistics

Standard deviation: 805.4131065
Coefficient of variation (CV): 3.289427858
Kurtosis: 2240.24006
Mean: 244.8489954
Median Absolute Deviation (MAD): 81.14285714
Skewness: 45.04601647
Sum: 683128.6973
Variance: 648690.2721
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
100                   11      0.4%
86                    8       0.3%
82                    8       0.3%
60                    8       0.3%
197                   7       0.3%
64                    7       0.3%
75                    7       0.3%
143.5                 7       0.3%
44                    7       0.3%
153                   6       0.2%
Other values (1927)   2714    97.3%

Value                 Count   Frequency (%)
1                     1       < 0.1%
1.5                   1       < 0.1%
3.333333333           1       < 0.1%
5.333333333           1       < 0.1%
5.666666667           1       < 0.1%
6.142857143           1       < 0.1%
7.5                   1       < 0.1%
9                     1       < 0.1%
9.5                   1       < 0.1%
11                    1       < 0.1%

Value                 Count   Frequency (%)
40498.5               1       < 0.1%
6009.333333           1       < 0.1%
3684.47619            1       < 0.1%
2880                  1       < 0.1%
2697.465753           1       < 0.1%
2183.2                1       < 0.1%
2160.333333           1       < 0.1%
2082.225806           1       < 0.1%
2000                  1       < 0.1%
1953.5                1       < 0.1%

Interactions

[Figure: pairwise interaction plots between variables]
2022-03-26T12:00:19.397367image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:22.593115image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:25.626157image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:29.341815image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:33.082544image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:36.844705image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T11:59:52.688055image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T11:59:56.189537image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T11:59:59.899199image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:03.704344image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:08.011422image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:13.390688image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:16.599542image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:19.623852image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:22.833232image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:25.853186image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:29.817888image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-03-26T12:00:33.341692image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
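
The calculation described above can be sketched in plain Python. This is an illustration only, not the code the report uses: the helper names (`average_ranks`, `pearson`, `spearman_rho`) and the toy values are assumptions introduced here, and tied values are given the mean of their positions.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors cancel, so plain sums of deviations are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (made up): y = x**2 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)   # the ranks agree perfectly
r_raw = pearson(x, y)      # roughly 0.98, lower because the relation is curved
```

On this toy input the ranks of `x` and `y` are identical, so ρ reaches its maximum while Pearson's r on the raw values stays below 1.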

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line relative to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
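
As a rough sketch of this formula and of the invariance property mentioned above, the following plain-Python snippet computes r directly. The function name `pearson_r` and the sample values are assumptions made for illustration; they are not taken from the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and the standard deviations cancel,
    so plain sums of deviations are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Made-up sample values.
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]

r = pearson_r(x, y)

# Shifting and rescaling each variable separately (positive factors) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A negative scale factor on one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r([-a for a in x], y)
```

Here `r` and `r_rescaled` agree up to floating-point rounding, while `r_flipped` has the same magnitude with the opposite sign.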

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
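
A minimal sketch of this pair-counting definition in plain Python follows. The function name `kendall_tau_a` and the sample values are illustrative assumptions. It implements the simple form described here (often called τ-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant (τ-b), so their results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


# Made-up sample: only the middle two observations are out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs in total
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = (5 - 1) / 6.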

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

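| | df_index | customer_id | gross_revenue | recency_days | qt_items | qt_invoice | qt_products | avg_ticket | avg_recency_days | frequency | qt_returns | avg_unique_basket_size | avg_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 17850 | 5391.2100 | 372.0000 | 1733.0000 | 34.0000 | 21.0000 | 18.1522 | 1.0000 | 34.0000 | 40.0000 | 8.7353 | 50.9706 |
| 1 | 1 | 13047 | 3237.5400 | 31.0000 | 1391.0000 | 10.0000 | 105.0000 | 18.8229 | 1.0000 | 0.0292 | 36.0000 | 17.1000 | 139.1000 |
| 2 | 2 | 12583 | 7281.3800 | 2.0000 | 5060.0000 | 15.0000 | 114.0000 | 29.4793 | 1.0000 | 0.0404 | 51.0000 | 15.4667 | 337.3333 |
| 3 | 3 | 13748 | 948.2500 | 95.0000 | 439.0000 | 5.0000 | 24.0000 | 33.8661 | 1.0000 | 0.0180 | 0.0000 | 5.6000 | 87.8000 |
| 4 | 4 | 15100 | 876.0000 | 333.0000 | 80.0000 | 3.0000 | 1.0000 | 292.0000 | 1.0000 | 0.0750 | 22.0000 | 1.0000 | 26.6667 |
| 5 | 5 | 15291 | 4668.3000 | 25.0000 | 2103.0000 | 15.0000 | 61.0000 | 45.3233 | 1.0000 | 0.0431 | 29.0000 | 6.8000 | 140.2000 |
| 6 | 6 | 14688 | 5630.8700 | 7.0000 | 3621.0000 | 21.0000 | 148.0000 | 17.2198 | 1.0526 | 0.0574 | 399.0000 | 15.5714 | 172.4286 |
| 7 | 7 | 17809 | 5411.9100 | 16.0000 | 2057.0000 | 12.0000 | 46.0000 | 88.7198 | 1.0000 | 0.0336 | 42.0000 | 5.0833 | 171.4167 |
| 8 | 8 | 15311 | 60767.9000 | 0.0000 | 38194.0000 | 91.0000 | 567.0000 | 25.5435 | 1.0000 | 0.2440 | 474.0000 | 26.1429 | 419.7143 |
| 9 | 9 | 14527 | 8508.8200 | 2.0000 | 2089.0000 | 55.0000 | 329.0000 | 8.7539 | 1.0000 | 0.1499 | 40.0000 | 17.6545 | 37.9818 |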

Last rows

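| | df_index | customer_id | gross_revenue | recency_days | qt_items | qt_invoice | qt_products | avg_ticket | avg_recency_days | frequency | qt_returns | avg_unique_basket_size | avg_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2780 | 5808 | 12784 | 574.4200 | 9.0000 | 300.0000 | 2.0000 | 53.0000 | 9.7359 | 1.0000 | 0.3333 | 0.0000 | 29.0000 | 150.0000 |
| 2781 | 5810 | 14785 | 77.4000 | 10.0000 | 84.0000 | 2.0000 | 2.0000 | 25.8000 | 1.0000 | 0.4000 | 0.0000 | 1.5000 | 42.0000 |
| 2782 | 5811 | 17254 | 272.4400 | 4.0000 | 252.0000 | 2.0000 | 100.0000 | 2.4325 | 1.0000 | 0.1818 | 0.0000 | 56.0000 | 126.0000 |
| 2783 | 5828 | 17232 | 421.5200 | 2.0000 | 203.0000 | 2.0000 | 30.0000 | 11.7089 | 1.0000 | 0.1667 | 0.0000 | 18.0000 | 101.5000 |
| 2784 | 5829 | 17468 | 137.0000 | 10.0000 | 116.0000 | 2.0000 | 5.0000 | 27.4000 | 1.0000 | 0.5000 | 0.0000 | 2.5000 | 58.0000 |
| 2785 | 5840 | 13596 | 697.0400 | 5.0000 | 406.0000 | 2.0000 | 133.0000 | 4.1990 | 1.0000 | 0.2857 | 0.0000 | 83.0000 | 203.0000 |
| 2786 | 5846 | 14893 | 1237.8500 | 9.0000 | 799.0000 | 2.0000 | 72.0000 | 16.9568 | 1.0000 | 1.0000 | 0.0000 | 36.5000 | 399.5000 |
| 2787 | 5871 | 14126 | 706.1300 | 7.0000 | 508.0000 | 3.0000 | 14.0000 | 47.0753 | 1.0000 | 1.0000 | 50.0000 | 5.0000 | 169.3333 |
| 2788 | 5877 | 13521 | 1093.6500 | 1.0000 | 736.0000 | 3.0000 | 312.0000 | 2.5084 | 1.0000 | 0.3333 | 0.0000 | 145.0000 | 245.3333 |
| 2789 | 5887 | 15060 | 303.0900 | 8.0000 | 263.0000 | 4.0000 | 80.0000 | 2.5049 | 1.0000 | 4.0000 | 0.0000 | 30.0000 | 65.7500 |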